Robust speech/non-speech detection using LDA applied to MFCC
Authors
Abstract
In speech recognition, speech/non-speech detection must be robust to noise. This work presents a new method for speech/non-speech detection that applies Linear Discriminant Analysis (LDA) to Mel Frequency Cepstrum Coefficients (MFCC). Energy is the most discriminant parameter between noise and speech, but a detection system based on this single parameter detects too many noise segments. Applying LDA to the MFCC, together with the associated test, reduces the detection of noise segments. The new algorithm is compared with one based on signal-to-noise ratio (SNR) [1].
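The idea of applying LDA to cepstral feature vectors can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain two-class Fisher LDA, uses synthetic Gaussian vectors as stand-ins for real MFCC frames, and the names `fit_fisher_lda` and `is_speech` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def fit_fisher_lda(X, y):
    """Fit a two-class Fisher LDA; return projection vector w and threshold b."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    # Small ridge term keeps the linear solve numerically stable.
    Sw += 1e-6 * np.eye(X.shape[1])
    # Fisher direction: Sw^{-1} (m1 - m0).
    w = np.linalg.solve(Sw, m1 - m0)
    # Threshold at the midpoint of the projected class means.
    b = 0.5 * (w @ m0 + w @ m1)
    return w, b


def is_speech(X, w, b):
    """Label a frame as speech when its projection exceeds the threshold."""
    return X @ w > b


# Synthetic stand-ins for MFCC frames (assumption: 12 coefficients per frame).
rng = np.random.default_rng(0)
n_coeffs = 12
noise = rng.normal(0.0, 1.0, size=(500, n_coeffs))   # class 0: non-speech
speech = rng.normal(1.5, 1.0, size=(500, n_coeffs))  # class 1: speech
X = np.vstack([noise, speech])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])

w, b = fit_fisher_lda(X, y)
accuracy = np.mean(is_speech(X, w, b) == (y == 1))
print(f"training accuracy on synthetic frames: {accuracy:.3f}")
```

In a real detector the synthetic arrays would be replaced by MFCC frames labelled as speech or noise, and the projection would typically be combined with an energy criterion, as the abstract suggests.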
Similar resources
Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition
Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust to the boundary precision. In this work, a new approach to evaluating detection algorithms for continuous speech recognition is presented. The speech/non-speech detection using energy parameter combin...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Robust Speech Detection with Heteroscedastic Discriminant Analysis Applied to the Time-frequency Energy
In this paper, we propose a robust speech detection algorithm with Heteroscedastic Discriminant Analysis (HDA) applied to the Time-Frequency Energy (TFE). The TFE consists of the log energy in time domain, the log energy in the fixed band 250-3500 Hz, and the log Mel-scale frequency bands energy. The bottom-up algorithm with automatic threshold adjustment is used for accurate word boundary detec...
Konuşma Tanıma İçin Heteroskedastik Ayırtaç Analizinin Düzenlileştirilmesi (Regularizing Heteroschedastic Discriminant Analysis for Speech Recognition)
Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT) applied to spliced static MFCC features yields important performance gains as compared to MFCC+dynamic features in most speech recognition tasks. It is reasonable to regularize LDA transform computation for stability. In this paper, we regularize LDA and heteroschedastic LDA transforms usin...
A weight estimation method using LDA for multi-band speech recognition
This paper proposes a band-weight estimation method using Linear Discriminant Analysis (LDA) for multi-band automatic speech recognition (ASR). In our scheme, a spectral domain feature, SPEC, is modeled using a multi-stream HMM technique. This paper also proposes the use of Output Likelihood Normalization (OLN) in combination with the LDA-based weight-estimation method in order to adjust the re...